31.2 Google Vertex AI 集成

12 分钟阅读

31.2.1 Vertex AI 概述#

Google Vertex AI 是 Google Cloud 提供的统一机器学习平台,支持通过 API 访问多种基础模型,包括 Anthropic 的 Claude 模型。通过 Vertex AI 使用 Claude Code 可以为企业带来以下优势:

Vertex AI 的优势#

  1. GCP 原生集成:与 Google Cloud IAM、Cloud Audit Logs、Cloud Monitoring 等服务无缝集成
  2. 全球端点:支持全球访问,提供更好的延迟和可用性
  3. 100 万令牌上下文窗口:支持超长上下文,适合大型代码库分析
  4. 企业级安全:符合 Google Cloud 安全标准和合规要求
  5. 灵活的部署:支持区域和全局端点,满足不同需求

适用场景#

  • 已经使用 Google Cloud Platform 的企业
  • 需要超长上下文窗口的应用
  • 要求使用 Google Cloud IAM 进行身份验证的场景
  • 需要全球访问和低延迟的环境

31.2.2 Vertex AI 配置步骤#

1. 前置条件检查#

class VertexAIPrerequisitesChecker: """Vertex AI 前置条件检查器"""

def init(self): self.checks = { 'gcp_project': False, 'vertex_enabled': False, 'model_access': False, 'iam_permissions': False, 'gcloud_configured': False }

def check_all(self) -> PrerequisiteReport: """检查所有前置条件""" report = PrerequisiteReport()

检查 GCP 项目

self.checks['gcp_project'] = self._check_gcp_project()

检查 Vertex AI 是否启用

self.checks['vertex_enabled'] = self._check_vertex_enabled()

检查模型访问权限

self.checks['model_access'] = self._check_model_access()

检查 IAM 权限

self.checks['iam_permissions'] = self._check_iam_permissions()

检查 gcloud 配置

self.checks['gcloud_configured'] = self._check_gcloud_configured()

生成报告

report.checks = self.checks report.all_passed = all(self.checks.values()) report.missing = [ check for check, passed in self.checks.items() if not passed ]

return report

def _check_gcp_project(self) -> bool: """检查 GCP 项目"""

try: result = subprocess.run( ['gcloud', 'config', 'get-value', 'project'], capture_output=True, text=True ) return result.returncode == 0 and result.stdout.strip() except Exception: return False

def _check_vertex_enabled(self) -> bool: """检查 Vertex AI 是否启用""" try: result = subprocess.run( ['gcloud', 'services', 'list', '--enabled'], capture_output=True, text=True ) return 'aiplatform.googleapis.com' in result.stdout except Exception: return False

2. 启用 Vertex AI API#

bash
bash

# 设置项目 ID
gcloud config set project YOUR-PROJECT-ID

# 启用 Vertex AI API
gcloud services enable aiplatform.googleapis.com

# 验证启用
gcloud services list --enabled | grep aiplatform

### 3. 请求模型访问权限

# 通过 gcloud 请求访问
gcloud ai models list \
--region=global \
--filter="displayName~'Claude'"
# 或通过控制台访问
# https://console.cloud.google.com/vertex-ai/model-garden

4. 配置 GCP 凭证#

选项 A:服务账户密钥

bash
bash

# 创建服务账户
gcloud iam service-accounts create claude-code-sa \
  --display-name="Claude Code Service Account"

# 授予必要权限
gcloud projects add-iam-policy-binding YOUR-PROJECT-ID \
  --member="serviceAccount:claude-code-sa@YOUR-PROJECT-ID.iam.gserviceaccount.com" \
  --role="roles/aiplatform.user"

# 创建密钥
gcloud iam service-accounts keys create claude-code-key.json \
  --iam-account=claude-code-sa@YOUR-PROJECT-ID.iam.gserviceaccount.com

# 设置环境变量
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/claude-code-key.json

#### 选项 B:ADC (Application Default Credentials)

# 使用用户凭证
gcloud auth application-default login
# 或使用服务账户
gcloud auth application-default login \
--impersonate-service-account=claude-code-sa@YOUR-PROJECT-ID.iam.gserviceaccount.com

选项 C:工作负载身份联邦

bash
bash

# 配置工作负载身份联邦
gcloud iam workload-identity-pools create-cred-config \
  projects/YOUR-PROJECT-ID/locations/global/workloadIdentityPools/POOL-NAME/providers/PROVIDER-NAME \
  --credential-source-file=credential-config.json \
  --output-file=adc.json

export GOOGLE_APPLICATION_CREDENTIALS=/path/to/adc.json

### 5. 启用 Claude Code Vertex AI 集成

# 启用 Vertex AI
export CLAUDE_CODE_USE_VERTEX=1
# 设置项目 ID
export ANTHROPIC_VERTEX_PROJECT_ID=YOUR-PROJECT-ID
# 设置区域(全局或特定区域)
export CLOUD_ML_REGION=global
# 可选:为不同模型设置不同区域
export VERTEX_REGION_CLAUDE_3_5_HAIKU=us-east5
export VERTEX_REGION_CLAUDE_3_5_SONNET=us-east5
export VERTEX_REGION_CLAUDE_4_0_OPUS=europe-west1
export VERTEX_REGION_CLAUDE_4_0_SONNET=us-east5
export VERTEX_REGION_CLAUDE_4_1_OPUS=europe-west1

6. 配置模型#

bash
bash

# 主模型
export ANTHROPIC_MODEL='claude-sonnet-4-5@20250929'

# 小型/快速模型
export ANTHROPIC_SMALL_FAST_MODEL='claude-haiku-4-5@20251001'

# 使用 100 万令牌上下文窗口
export ANTHROPIC_MODEL='claude-sonnet-4-5@20250929'
export ANTHROPIC_VERTEX_ENABLE_1M_CONTEXT=1

## 31.2.3 IAM 权限配置

### 基础 IAM 角色

# 使用预定义角色
gcloud projects add-iam-policy-binding YOUR-PROJECT-ID \
--member="serviceAccount:claude-code-sa@YOUR-PROJECT-ID.iam.gserviceaccount.com" \
--role="roles/aiplatform.user"

自定义 IAM 角色#

bash
yaml

# custom-role.yaml
title: "Claude Code Vertex AI Role"
description: "Custom role for Claude Code Vertex AI access"
stage: "GA"
includedPermissions:
  - aiplatform.endpoints.predict
  - aiplatform.endpoints.streamPredict
  - aiplatform.models.list
  - aiplatform.models.get

# 创建自定义角色
gcloud iam roles create claude-code-vertex-role \
--project=YOUR-PROJECT-ID \
--file=custom-role.yaml
# 授予自定义角色
gcloud projects add-iam-policy-binding YOUR-PROJECT-ID \
--member="serviceAccount:claude-code-sa@YOUR-PROJECT-ID.iam.gserviceaccount.com" \
--role="projects/YOUR-PROJECT-ID/roles/claude-code-vertex-role"

组织策略配置#

bash
bash

# 创建组织策略以限制访问
gcloud resource-manager org-policies create \
  --organization=YOUR-ORG-ID \
  --name=restrict-vertex-ai-models \
  --policy-file=vertex-ai-policy.yaml

# vertex-ai-policy.yaml
name: organizations/YOUR-ORG-ID/policies/restrict-vertex-ai-models
spec:
  rules:
  - enforce: true
    values:
      allowedValues:
      - "claude-sonnet-4-5@20250929"
      - "claude-haiku-4-5@20251001"

## 31.2.4 高级配置

### 100 万令牌上下文窗口

class ContextWindowManager:
"""上下文窗口管理器"""
def __init__(self):
self.max_tokens = 1_000_000
self.context_headers = {
'context-1m-2025-08-07': 'true'
}
def enable_extended_context(self) -> Dict[str, str]:
"""启用扩展上下文窗口"""
return {
'model': 'claude-sonnet-4-5@20250929',
'headers': self.context_headers,
'env_vars': {
'ANTHROPIC_VERTEX_ENABLE_1M_CONTEXT': '1'
}
}
def estimate_token_count(self, text: str) -> int:
"""估算令牌数量"""
# 粗略估算:1 token ≈ 4 characters
return len(text) // 4
def check_context_fit(self, text: str) -> bool:
"""检查文本是否适合上下文窗口"""
token_count = self.estimate_token_count(text)
return token_count <= self.max_tokens

区域配置优化#

bash
python

class RegionOptimizer:
    """区域优化器"""

    def __init__(self):
        self.regions = {
            'us-east5': {
                'latency': 50,
                'cost_factor': 1.0,
                'availability': 0.999
            },
            'europe-west1': {
                'latency': 80,
                'cost_factor': 1.1,
                'availability': 0.999
            },
            'asia-southeast1': {
                'latency': 100,
                'cost_factor': 0.9,
                'availability': 0.998
            }
        }

    def select_optimal_region(self,
                             user_location: str,
                             requirements: Dict) -> str:
        """选择最优区域"""
        # 基于用户位置选择
        if 'US' in user_location:
            primary_region = 'us-east5'
        elif 'EU' in user_location:
            primary_region = 'europe-west1'
        else:
            primary_region = 'asia-southeast1'

        # 根据需求调整
        if requirements.get('low_latency', False):
            return primary_region
        elif requirements.get('low_cost', False):
            return min(
                self.regions.items(),
                key=lambda x: x[1]['cost_factor']
            )[0]
        else:
            return primary_region

### 提示缓存配置

# 启用提示缓存(默认启用)
# 在请求中包含 cache_control 标志
# 禁用提示缓存
export DISABLE_PROMPT_CACHING=1

31.2.5 监控和故障排除#

Cloud Monitoring 配置#

bash
python

class VertexAIMonitor:
    """Vertex AI 监控器"""

    def __init__(self):
        self.monitoring_client = monitoring_v3.MetricServiceClient()
        self.project_name = f"projects/{os.getenv('GOOGLE_CLOUD_PROJECT')}"

    def create_dashboard(self):
        """创建监控仪表板"""
        dashboard = {
            "displayName": "Claude Code Vertex AI Dashboard",
            "gridLayout": {
                "widgets": [
                    {
                        "title": "Request Count",
                        "xyChart": {
                            "dataSets": [{
                                "timeSeriesQuery": {
                                    "timeSeriesFilter": {
                                        "filter": "resource.type=\"aiplatform.googleapis.com/Endpoint\"",
                                        "aggregation": {
                                            "alignmentPeriod": "60s",
                                            "perSeriesAligner": "ALIGN_RATE"
                                        }
                                    }
                                }
                            }]
                        }
                    },
                    {
                        "title": "Latency",
                        "xyChart": {
                            "dataSets": [{
                                "timeSeriesQuery": {
                                    "timeSeriesFilter": {
                                        "filter": "metric.type=\"aiplatform.googleapis.com/prediction_latency\"",
                                        "aggregation": {
                                            "alignmentPeriod": "60s",
                                            "perSeriesAligner": "ALIGN_PERCENTILE_99"
                                        }
                                    }
                                }
                            }]
                        }
                    }
                ]
            }
        }

        self.monitoring_client.create_dashboard(
            name=f"{self.project_name}/dashboards/claude-code",
            body=dashboard
        )

### 常见问题解决

class VertexAITroubleshooter:
"""Vertex AI 故障排除器"""
def diagnose(self, error: str) -> DiagnosisResult:
"""诊断问题"""
if 'PermissionDenied' in error:
return self._diagnose_permission_denied()
elif 'ModelNotFound' in error:
return self._diagnose_model_not_found()
elif 'QuotaExceeded' in error:
return self._diagnose_quota_exceeded()
elif 'InvalidArgument' in error:
return self._diagnose_invalid_argument()
else:
return DiagnosisResult(
issue='Unknown',
solution='Check Cloud Logging for details'
)
def _diagnose_permission_denied(self) -> DiagnosisResult:
"""诊断权限拒绝错误"""
return DiagnosisResult(
issue='IAM Permission Denied',
solution='''1. Verify service account has aiplatform.user role

commands=[
'gcloud projects get-iam-policy YOUR-PROJECT-ID',
'gcloud ai models list --region=global'
]
)
def _diagnose_quota_exceeded(self) -> DiagnosisResult:
"""诊断配额超限错误"""
return DiagnosisResult(
issue='Quota Exceeded',
solution='''1. Check current quota in Cloud Console

commands=[
'gcloud compute project-info describe --project=YOUR-PROJECT-ID',
'gcloud ai models list --region=global --filter="displayName~\'Claude\'"'
]
)

日志配置#

bash
bash

# 启用详细日志
export GOOGLE_CLOUD_LOGGING_LEVEL=debug

# 查看日志
gcloud logging read "resource.type=aiplatform.googleapis.com/Endpoint" \
  --project=YOUR-PROJECT-ID \
  --limit=50 \
  --format="table(timestamp,protoPayload.requestId,protoPayload.error)"

通过正确配置 Google Vertex AI,企业可以利用 Google Cloud 的强大基础设施,安全、高效地部署 Claude Code,并享受超长上下文窗口带来的优势。

标记本节教程为已读

记录您的学习进度,方便后续查看。